Analysis and prediction of RNA-binding residues using sequence, evolutionary conservation, and predicted secondary structure and solvent accessibility.

Authors

  • Tuo Zhang
  • Hua Zhang
  • Ke Chen
  • Jishou Ruan
  • Shiyi Shen
  • Lukasz Kurgan
Abstract

Identification and prediction of RNA-binding residues (RBRs) provides valuable insights into the mechanisms of protein-RNA interactions. We analyzed the contributions of a wide range of factors, including amino acid sequence, evolutionary conservation, secondary structure, and solvent accessibility, to the prediction and characterization of RBRs. Five feature sets were designed, and feature selection was performed to find and investigate relevant features. We demonstrate that (1) interactions with the positively charged amino acids Arg and Lys are preferred by the negatively charged nucleotides; (2) Gly provides flexibility for the RNA-binding sites; (3) Glu, with its negatively charged side chain, and several hydrophobic residues such as Leu, Val, Ala and Phe are disfavored in the RNA-binding sites; (4) coil residues, especially in long segments, are more flexible than other secondary structures and more likely to interact with RNA; (5) helical residues are more rigid and consequently less likely to bind RNA; and (6) residues partially exposed to the solvent are more likely to form RNA-binding sites. We introduce a novel sequence-based predictor of RBRs, RBRpred, which utilizes the selected features. RBRpred is comprehensively tested on three datasets with varied atom distance cutoffs by performing both five-fold cross-validation and jackknife tests, and it achieves a Matthews correlation coefficient (MCC) of 0.51, 0.48 and 0.42, respectively. The quality is comparable to or better than that of state-of-the-art predictors that apply the distance-based cutoff definition. We show that the most important factor for RBR prediction is evolutionary conservation, followed by the amino acid sequence, predicted secondary structure, and predicted solvent accessibility. We also investigate the impact of using native vs. predicted secondary structure and solvent accessibility. The predicted secondary structure and solvent accessibility are sufficient for RBR prediction, while knowledge of the actual solvent accessibility helps in predictions for lower distance cutoffs.
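The MCC values quoted in the abstract follow the standard definition of the Matthews correlation coefficient for binary predictions. The sketch below shows only that textbook formula, not the authors' evaluation code. The function names `confusion_counts` and `mcc` and the toy labels are illustrative assumptions, with 1 standing for an RNA-binding residue and 0 for a non-binding residue.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the source does not state how this case is handled).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical per-residue labels for a six-residue toy chain.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1]
toy_mcc = mcc(*confusion_counts(y_true, y_pred))
```

MCC ranges from -1 to 1 and stays informative when one class dominates, which is the usual situation for binding-residue data, where non-binding residues typically outnumber binding ones.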

Similar resources

Prediction of protein functional residues from sequence by probability density estimation

Motivation: The prediction of ligand-binding residues or catalytically active residues of a protein may give important hints that can guide further genetic or biochemical studies. Existing sequence-based prediction methods mostly rank residue positions by evolutionary conservation calculated from a multiple sequence alignment of homologs. A problem hampering more widespread application of these...

Phylogenetic Analysis of Beta-Glucanase Producing Actinomycetes Strain TBG-CH22 - A Comparison of Conventional and Molecular Morphometric Approach

Actinomycetes are inexhaustible producers of commercially valuable metabolites and are continually screened for beneficial compounds. The taxonomic and phylogenetic study of novel actinomycetes strains is mostly based on conventional methods and the primary DNA structure of 16S rRNA. Although the 16S rRNA sequence is well accepted in phylogeny studies, its secondary structures have not been widely used. ...

Evolutionary Analysis of Mammalian ACE2 and the Key Residues Involved in Binding to the Spike Protein Revealed Potential SARS-CoV-2 Hosts

Introduction: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spilled over to humans via wild mammals, entering the host cell using angiotensin-converting enzyme 2 (ACE2) as receptor through Spike (S) protein binding. While SARS-CoV-2 became fully adapted to humans and globally spread, some mammal species were infected back. The present study evaluated the potential risk of mammals...

Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure.

The present study is an attempt to develop a neural network-based method for predicting the real value of solvent accessibility from the sequence using evolutionary information in the form of multiple sequence alignment. In this method, two feed-forward networks with a single hidden layer have been trained with standard back-propagation as a learning algorithm. The Pearson's correlation coeffic...

Journal:
  • Current protein & peptide science

Volume 11, Issue 7

Pages: -

Publication date: 2010